Happy Git and GitHub for the useR
Jenny Bryan, the STAT 545 TAs, Jim Hester
5/26/2021
The life of your software is recorded from the beginning.
at any moment you can revert to a previous revision
the history is browseable, you can inspect any revision
all the deleted content remains accessible in the history
You may have multiple variants of the same software, materialized as branches, for example:
Git will help you to:
| Windows | macOS | Linux |
|---|---|---|
| Git for Windows | xcode-select --install | sudo apt-get install git (Ubuntu or Debian); sudo yum install git (Fedora or RedHat) |
C:/Program Files and this appears to be the default. Unless you have specific reasons to do otherwise, follow this convention.
In the shell:
git config --global user.name 'Jane Doe'
git config --global user.email 'jane@example.com'
substituting your name and the email associated with your GitHub account.
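To check that the values were stored, you can read them back. The sketch below is only an illustration: it points HOME at a throwaway directory (an assumption made here so that your real global configuration is not modified) and reuses the placeholder identity from above.

```shell
set -e
# Assumption: use a temporary HOME so the real ~/.gitconfig stays untouched.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.name 'Jane Doe'
git config --global user.email 'jane@example.com'

# Read the two values back
git config --global user.name
git config --global user.email

# List everything stored at the global level
git config --global --list
```

On your own machine you would skip the first three lines and run only the git config commands.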
Git is a version control system whose original purpose was to help developers work collaboratively on big software projects. Git manages the evolution of a set of files called a repository (repo).
For new or existing projects, we recommend that you:
How often and when should I do that?
Should I then be afraid of this new folder linked to Git?
The daily workflow is probably not dramatically different from what you do currently. You work in the usual way, writing R scripts or authoring reports in LaTeX or R Markdown. But instead of only saving individual files, periodically you make a commit, which takes a snapshot of all the files in the entire project.
Chances are you already follow Git practices without knowing it
| Task | Purpose | Most researchers’ approach | Git approach |
|---|---|---|---|
| Versioning | You wrote a file that has reached a version that is significant to you and that you might want to inspect or revert to later. Or you are reviewing a version of someone else’s document. | Append your initials and the date at the end of the current file name. | Make a commit. |
| Backup, sharing | You have optimized and exhaustively tested your code and want to release it. This version is so important that you want to archive it and make sure it can never be accidentally lost. | In addition to a saved copy on your computer, save it on an external hard drive or in a cloud service such as CNRS Seafile or UNCLOUD. | Periodically push commits to GitHub. |
Tags. You can also designate certain snapshots as special with a tag, which is a name of your choosing. In a software project, it is typical to tag a release with its version, e.g., “v1.0.3”. For a manuscript or analytical project, you might tag the version submitted to a journal or transmitted to external collaborators.
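As a minimal sketch of tagging, the commands below create a throwaway repository, make one commit, and mark it with an annotated tag. The file name manuscript.txt, the commit message, and the tag message are hypothetical and chosen only for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo 'draft' > manuscript.txt
git add manuscript.txt
git commit -q -m "first draft"

# Mark the current commit with an annotated tag
git tag -a v1.0.3 -m "version submitted to the journal"

# List the tags together with the first line of their message
git tag -n
```

A tag created this way stays local until it is pushed explicitly, for example with git push origin v1.0.3.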
git init myrepository # Creates the directory myrepository and makes it a Git repo
git init # Makes the current directory a Git repo
The first command creates the directory myrepository:
- the repository itself is stored in myrepository/.git/
- the working copy is myrepository/
git init macs
Initialized empty Git repository in /Users/stamm-a/Softs/git-workshop/macs/.git/
ls -a macs
. .. .git
ls macs/.git/
HEAD config description hooks info objects refs
cd macs
git status # show the status of the index and working copy
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)
echo 'Hello World!' > hello
git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
hello
nothing added to commit but untracked files present (use "git add" to track)
git add files # Copy files into the index
git commit [-m message] # Commits the content of the index
git add hello
git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: hello
git commit -m "added file 'hello'"
[master (root-commit) b82c5b4] added file 'hello'
1 file changed, 1 insertion(+)
create mode 100644 hello
git status
On branch master
nothing to commit, working tree clean
echo "Happy" >> hello
git commit
On branch master
Changes not staged for commit:
modified: hello
no changes added to commit
git add hello
git commit -m "added Happy to hello content"
[master a632810] added Happy to hello content
1 file changed, 1 insertion(+)
git rm file # remove the file from the index and from the working copy
git commit # commit the index
git rm hello
rm 'hello'
git commit -m "removed hello"
[master 2d96b5b] removed hello
1 file changed, 2 deletions(-)
delete mode 100644 hello
git diff [rev_a [rev_b]] [-- path ...]
Shows the differences between two revisions rev_a and rev_b. By default:
- rev_a is the index,
- rev_b is the working copy.
git diff --staged [rev_a] [-- path ...]
Shows the differences between rev_a and the index. By default:
- rev_a is HEAD (symbolic reference to last commit).
git diff and the index
echo foo >> hello
git add hello
echo bar >> hello
git diff
diff --git a/hello b/hello
index 257cc56..3bd1f0e 100644
--- a/hello
+++ b/hello
@@ -1 +1,2 @@
 foo
+bar
git diff --staged
diff --git a/hello b/hello
new file mode 100644
index 0000000..257cc56
--- /dev/null
+++ b/hello
@@ -0,0 +1 @@
+foo
git diff HEAD
diff --git a/hello b/hello
new file mode 100644
index 0000000..3bd1f0e
--- /dev/null
+++ b/hello
@@ -0,0 +1,2 @@
+foo
+bar
git reset [--hard] [-- path ...]
git reset drops the changes staged in the index (the index is restored to the last commit; the working copy is left untouched),
git reset --hard drops all the changes in the index and in the working copy.
git checkout -- path
This command restores a file (or directory) as it appears in the index (thus it drops all unstaged changes).
git diff HEAD
diff --git a/hello b/hello
new file mode 100644
index 0000000..3bd1f0e
--- /dev/null
+++ b/hello
@@ -0,0 +1,2 @@
+foo
+bar
git checkout -- .
git diff HEAD
diff --git a/hello b/hello
new file mode 100644
index 0000000..257cc56
--- /dev/null
+++ b/hello
@@ -0,0 +1 @@
+foo
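The difference between git reset and git reset --hard can be tried out safely in a scratch repository. This is a sketch under assumed names (a temporary directory and a file called hello), not a transcript of the session above.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo foo > hello
git add hello
git commit -q -m "add hello"

echo bar >> hello
git add hello          # the 'bar' line is now staged

git reset -q           # unstage: the index matches the last commit again,
git status --short     # but hello still contains 'bar' (shown as modified, not staged)

git reset -q --hard    # also discard the edit in the working copy
cat hello              # only the committed content is left
```

Because git reset --hard discards uncommitted work for good, it is worth running git status first.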
git log
commit 2d96b5bf2d031af0e8e7d63beed256d113c4e44c
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date: Mon May 31 10:48:04 2021 +0200
removed hello
commit a632810446f7dfa44b5cafc66662bf1e9c4e790c
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date: Mon May 31 10:48:04 2021 +0200
added Happy to hello content
commit b82c5b4c6680fc06c8bdf61f7566c950c0381d81
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date: Mon May 31 10:48:04 2021 +0200
added file 'hello'
Commit details:
git show
commit 2d96b5bf2d031af0e8e7d63beed256d113c4e44c
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date: Mon May 31 10:48:04 2021 +0200
removed hello
diff --git a/hello b/hello
deleted file mode 100644
index ebc17f3..0000000
--- a/hello
+++ /dev/null
@@ -1,2 +0,0 @@
-Hello World!
-Happy
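Since deleted content remains accessible in the history, a removed file can be inspected or brought back. The following sketch rebuilds a small history in a temporary directory (an assumption for illustration) and then retrieves the deleted file from the commit before the removal.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo 'Hello World!' > hello
git add hello
git commit -q -m "added file 'hello'"
git rm -q hello
git commit -q -m "removed hello"

git log --oneline              # one line per commit, newest first
git show HEAD~1:hello          # print hello as it was one commit before HEAD
git checkout HEAD~1 -- hello   # copy that version back into the index and working copy
git status --short             # hello shows up as a staged new file
```

HEAD~1 means "one commit before the last one"; any commit hash printed by git log can be used in its place.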
git mv # move or rename a file
git tag # create or delete tags
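As a small illustration of git mv, the sketch below renames a tracked file in a scratch repository. The new name greeting is hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo 'Hello World!' > hello
git add hello
git commit -q -m "added file 'hello'"

git mv hello greeting          # rename the file and stage the rename in one step
git status --short             # the rename is listed as a staged change
git commit -q -m "renamed hello to greeting"

# --follow lets the log continue across the rename
git log --oneline --follow -- greeting
```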
Which do you prefer?
GitKraken is a free, powerful Git(Hub) client. It’s especially exciting because it works on Windows, macOS, and Linux. This is great news, especially for long-suffering Linux users who have previously had very few options.
SourceTree is an alternative free client for Windows users.
GitHub offers a free Git(Hub) client, GitHub Desktop, for Windows and macOS. GitHub Desktop is aimed at beginners who want the most useful features of Git front and center.
DEMO
Go to https://github.com and make sure you are logged in.
Click green “New repository” button. Or, if you are on your own profile page, click on “Repositories”, then click the green “New” button.
Name the repository myrepo (or whatever you wish).
Click the big green button “Create repository.”
Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button. Or copy the SSH URL if you chose to set up SSH keys.
Refer to (https://happygitwithr.com/new-github-first.html#make-a-repo-on-github-2).
It is likely that your first push leads to a challenge for your GitHub username and password. This will drive you crazy in the long-run and make you reluctant to push. You want to eliminate this annoyance.
You should set up once and for all a Personal Access Token (PAT).
Use it as password the first time a Git command prompts you for your credentials.
GitHub will no longer bother you with credentials after that.
Main recommendations
DEMO
We create a new project, with the preferred “GitHub first, then RStudio” sequence. Why do we prefer this? Because this method of copying the Project from GitHub to your computer also sets up the local Git repository for immediate pulling and pushing. In the absence of other constraints, I suggest that all of your R projects have exactly this set-up.
This is the main approach if you already have an existing local project that you want to bring to GitHub.
An explicit workflow for connecting an existing local R project to GitHub, when for some reason you cannot or don’t want to do a “GitHub first” workflow. When does this come up? Example: it’s an existing project that is already a Git repo with a history you care about. Then you have to do this properly.
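On the command line, connecting an existing local repo to a remote usually comes down to git remote add followed by a first push. In the sketch below a local bare repository stands in for the empty GitHub repo, so that the example runs offline; with GitHub you would use the HTTPS or SSH URL of your repo instead. The project name myproject and the file analysis.R are hypothetical.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the empty repo created on GitHub
git init -q --bare "$work/remote.git"

# An existing local project with some history
git init -q "$work/myproject"
cd "$work/myproject"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "existing work"

# Connect the local repo to the remote and push the history
git remote add origin "$work/remote.git"
git push -q -u origin master              # -u records origin/master as the upstream branch

git remote -v
```

After the first push with -u, plain git pull and git push work without further arguments.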
Connect a local directory to the remote team repo https://github.com/astamm/macs.git
| Team Member | External Collaborator |
|---|---|
| git clone https://github.com/astamm/macs.git some_local_folder | Forking process |
From the master branch, retrieve last modifications made to master branch via git pull;
From the master branch, create another branch my_awesome_feature for implementing your brand new feature in the software via git checkout -b my_awesome_feature;
Write down your code locally;
Stage (git add to transfer to staging area) edited files;
Commit (git commit [-m msg] to register staged work);
Push (git push to send to remote branch);
Rinse and repeat from Step 4.
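The steps above can be sketched end to end as follows. A local bare repository plays the role of the team repo so that the example runs offline, and the file code.R and the commit messages are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME='Jane Doe' GIT_AUTHOR_EMAIL='jane@example.com'
export GIT_COMMITTER_NAME='Jane Doe' GIT_COMMITTER_EMAIL='jane@example.com'
work="$(mktemp -d)"

# Setup: a stand-in for the team repo, seeded with one commit on master
git init -q --bare "$work/team.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
echo 'v1' > code.R
git add code.R
git commit -q -m "initial commit"
git remote add origin "$work/team.git"
git push -q -u origin master

# The workflow itself
git pull -q                                       # update the local master
git checkout -q -b my_awesome_feature             # create the feature branch and switch to it
echo 'my awesome feature' >> code.R               # write code
git add code.R                                    # stage
git commit -q -m "implement my awesome feature"   # commit
git push -q -u origin my_awesome_feature          # push; a new branch typically needs -u on its first push

git branch -vv
```

Once the upstream branch is set, later iterations only need git add, git commit and git push.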
This is the recommended workflow when you are already developing a new feature in a branch.
Switch to the master branch via git checkout master;
From the master branch, retrieve last modifications made to master branch via git pull;
From the my_awesome_feature branch, retrieve last modifications made to master branch via git merge master;
Stage (git add to transfer to staging area) edited files;
Commit (git commit [-m msg] to register staged work);
Push (git push to send to remote branch);
When you are satisfied with your implementation of the feature:
master branch;
git pull to make sure your local master matches the updated remote master.
When you are not part of the team, you can still contribute by forking the remote repo from GitHub.
Forking makes a copy of the master remote into your GitHub account under the same repo name;
When you git clone it, the remote master origin/master points to the forked master, which is why you can then git push to it;
The forked master should never be modified;
If you want to keep track of the latest changes in the original master, you need to manually add a remote via git remote add. By convention, the original repo should be called upstream;
git remote add upstream https://github.com/OWNER/REPO.git
Now you can stay up to date whenever you want by going to your master branch and executing
git pull upstream master
Then git push to save the modifications on your forked master as well.
Finally you can switch back to your branch via git checkout my_awesome_feature and run git merge master to bring your branch up to date as well.
Remember that the origin master branch is the one used by the world to install your software.
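The fork synchronization described above can be rehearsed offline. In this sketch, local bare repositories stand in for the original repo (upstream) and for your fork (origin); the directory names and the README file are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME='Jane Doe' GIT_AUTHOR_EMAIL='jane@example.com'
export GIT_COMMITTER_NAME='Jane Doe' GIT_COMMITTER_EMAIL='jane@example.com'
work="$(mktemp -d)"

# The original repo, seeded with one commit on master
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo 'v1' > README
git add README
git commit -q -m "initial commit"
git clone -q --bare "$work/seed" "$work/upstream.git"

# "Fork" it, clone the fork, and start a feature branch
git clone -q --bare "$work/upstream.git" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"
cd "$work/local"
git checkout -q -b my_awesome_feature

# Meanwhile, the original master moves forward
cd "$work/seed"
echo 'v2' >> README
git commit -q -am "upstream change"
git push -q "$work/upstream.git" master

# Bring the fork and the feature branch up to date
cd "$work/local"
git remote add upstream "$work/upstream.git"
git checkout -q master
git pull -q upstream master        # fetch and merge the original master
git push -q origin master          # update the master branch of the fork
git checkout -q my_awesome_feature
git merge -q master                # update the feature branch

git log --oneline
```

With a real fork, the two bare repositories are replaced by the GitHub URLs of the original repo and of your fork.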
Four rules to adopt that will make your life easy
In other words:
Do this once per new project.
Go to https://github.com and make sure you are logged in.
Click green “New repository” button. Or, if you are on your own profile page, click on “Repositories”, then click the green “New” button.
Click the big green button “Create repository.”
Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button. Or copy the SSH URL if you chose to set up SSH keys.
In a Bash terminal, type
git clone paste_from_clipboard folder_you_want_to_clone_into/
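To see what cloning sets up, you can inspect the remotes of the new local repo. The sketch below clones from a local bare repository (a stand-in assumed here so that the example runs without network access); with GitHub the first argument would be the URL pasted from the clipboard.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the repo created on GitHub
git init -q --bare "$work/myrepo.git"

# Clone it into a folder of your choice
git clone -q "$work/myrepo.git" "$work/myrepo"
cd "$work/myrepo"

# The clone already knows where it came from: the remote is named origin
git remote -v
```

Cloning a repo that has no commits yet prints a warning about an empty repository, which is harmless.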
If you want to experiment with teamwork, I can set up team GitHub projects and add you as members.
.Rprofile setup with use_devtools() and edit_r_profile()
create_package() with automatic setup of roxygen for handling documentation
use_XXX_license()
use_git()
use_github()
use_package()
use_data()
use_package_doc()
use_readme_rmd()
use_news_md()
use_vignette()
use_testthat() and use_test()
devtools::test()
DEMO